Abstract: In this paper, we consider the problem of detection
for dependent, non-stationary signals where the non-stationarity
is encoded in the dependence structure. We employ copula
theory, which allows for a general parametric characterization
of the joint distribution of sensor observations and, hence,
allows for a more general description of inter-sensor dependence.
We design a copula-based detector using the Neyman-Pearson
framework. Our approach involves a sample-wise copula selection
scheme, which, for a simple hypothesis test, is proved to perform
better than previously used single copula selection schemes.
We demonstrate the utility of our copula-based approach on
simulated data, and also for outdoor sensor data collected by the
Army Research Laboratory at the US southwest border.
Keywords: Detection, dependence modeling, heterogeneous
sensing, model selection, sensor fusion, information fusion.

I.  Introduction 

Fusion of data from heterogeneous sources of information,
observing a certain phenomenon, has been shown to improve
the performance of several inference tasks. Two sensors are
said to be heterogeneous if their respective observation models
cannot be described by the same probability density function
(pdf) [1]. Naturally, an information fusion system comprising
multi-modal sensors satisfies this definition. However, sensors
of the same modality can also be heterogeneous, in the sense
defined here, as they may span varied deployment and
manufacturing conditions.

In this paper, we consider the design of false-alarm
constrained detectors that operate in non-stationary environments.
The non-stationarity is assumed to manifest itself as
time-varying spatial dependence across the sensors. This is a
plausible situation, especially in multi-modal deployments: based
on the physics governing the individual modalities, transient
phenomena may affect one modality more drastically than the
other. This would, therefore, cause the intermodal dependence
to fluctuate, but leave the marginal models relatively invariant
within the same observation window. In other words, for
reasonably short observation windows, the signal from a single
modality can be modeled as a quasi-stationary process, an
approach that has been used extensively in spectral analysis
and statistical signal processing [2], [3]; modeling cross-sensor
dependence, on the other hand, would require a more
considered approach.

We use a copula-based approach to model dependence.
Copulas are parametric functions that couple univariate marginal
distribution functions to the corresponding multivariate
distribution function. A copula-based formulation is attractive
because observations may exhibit significant nonlinear
dependence across sensors, which cannot be adequately
characterized by a linear covariance matrix. Many families of
copula functions have been defined in the literature to address
this issue. Further, while kernel/learning based methods can
model nonlinear dependence and are known to converge to
the true joint distribution asymptotically, they also suffer from
scalability issues stemming from the curse of dimensionality.
Copulas are widely used to model stochastic dependence in the
fields of econometrics and finance [4] and have been shown
to be useful in various signal processing contexts [5]-[8].

In the sections that follow, we develop the above ideas
in more detail. Section II discusses the related literature and
provides a brief discussion on copula-based inference. The
detection problem is formulated in Section III. Copula
selection is an important component of any copula-based inference
task and our approach, described in Section IV, specifically
addresses the issue of non-stationary dependence. We discuss
our results in Section V. We compare the performance of our
detector to previously used approaches on simulated data and
also evaluate its performance on seismic and acoustic data
collected by the U.S. Army Research Laboratory at the US
southwest border. Concluding remarks and a brief discussion
on the directions for future research are provided in Section VI.

II.  Background 

A.  Previous  work 

Copula-based approaches for both centralized and
distributed signal processing have been studied recently. Iyengar
et al. [1] have investigated the general framework of
copula-based detection of a phenomenon being observed jointly by
heterogeneous sensors. They quantify the performance loss
due to copula misspecification and demonstrate that a detector
using a copula selection scheme based on area under the
receiver operating characteristic (ROC) can provide
significant improvement over models assuming independence. Their
results on a NIST multibiometric dataset show that the
copula-based approach is versatile: it can fuse not only
heterogeneous sensor measurements, but can also be applied to fuse
different algorithms. Sundaresan et al. [9] consider the case
of distributed detection and derive the optimum fusion rules
for a Neyman-Pearson detector. Sundaresan and Varshney [10]
also design and analyze the performance of a copula-based
estimation scheme for the localization of a radiation source.

As mentioned above, we also consider seismic-acoustic
fusion as an application of the proposed detector. The data
collected mimic typical scenarios for surveillance of human
activity at border crossings. Signals due to footsteps can be
used as surrogates for human presence when monitoring a
scene of interest using seismic or acoustic sensors. A
copula-based detector fusing seismic and acoustic footstep data for
personnel detection in indoor environments has been discussed
by Iyengar et al. [11]. Their approach combines canonical
correlation analysis (CCA) and copula modeling to
characterize cross-modal dependence in the frequency domain.
Time-frequency analysis using the spectrogram has been shown
to effectively characterize footstep information [12].
Copula-based approaches fusing indoor seismic data have also been
examined for footstep detection [13].


B. Copula theory

As stated in Section I, copulas are parametric functions that
couple univariate marginal distributions to a valid multivariate
distribution. They explicitly model the dependence among
random variables, which may have arbitrary marginal
distributions. Copula theory is an outcome of the work on probabilistic
metric spaces [14], and a copula was initially defined, on the
unit hypercube, as a joint probability distribution for uniform
marginals. Their application to statistical inference is possible
largely due to Sklar's theorem, which is stated below without
proof [15].

Theorem 1 (Sklar's Theorem): Consider an $m$-dimensional
distribution function $F$ with marginal distribution functions
$F_1, \ldots, F_m$. Then there exists a copula $C$, such that for all
$x_1, \ldots, x_m$ in $[-\infty, \infty]$

$$F(x_1, x_2, \ldots, x_m) = C(F_1(x_1), F_2(x_2), \ldots, F_m(x_m)). \tag{1}$$

If $F_k$ is continuous for $1 \le k \le m$, then $C$ is unique;
otherwise it is determined uniquely on $\mathrm{Ran}F_1 \times \ldots \times \mathrm{Ran}F_m$,
where $\mathrm{Ran}F_k$ is the range of the cumulative distribution function
(CDF) $F_k$. Conversely, given a copula $C$ and univariate CDFs
$F_1, \ldots, F_m$, $F$ as defined in (1) is a valid multivariate CDF
with marginals $F_1, \ldots, F_m$.

Note that the arguments of $C$ in (1) are uniformly distributed
random variables. As a direct consequence of Sklar's Theorem,
for continuous distributions, the joint probability density
function (pdf) is obtained by differentiating both sides of (1),

$$f(x_1, \ldots, x_m) = \left( \prod_{i=1}^{m} f_i(x_i) \right) c(F_1(x_1), \ldots, F_m(x_m)), \tag{2}$$

where $c$ is termed the copula density and is given by

$$c(\mathbf{u}) = \frac{\partial^m C(u_1, \ldots, u_m)}{\partial u_1 \cdots \partial u_m}, \tag{3}$$

where $u_i = F_i(x_i)$. Several copula functions are defined in the
literature, and are constructed to characterize different types
of dependence [15], of which the elliptical and Archimedean
copulas are widely used. Some of these are listed in Table I.
While not explicitly specified in (1) and (2), copula functions
contain a dependence parameter that quantifies the amount
of dependence between the $m$ random variables. We denote
the dependence parameter as $\phi$, which, in general, may be a
scalar, a vector or a matrix.

An attractive feature of copulas is that nonparametric
rank-based measures of dependence, such as Kendall's $\tau$, can be
expressed as expectations over the copula distribution. For
independent pairs of random variables $(X_1, Y_1)$ and $(X_2, Y_2)$
having the same distribution as $(X, Y)$, concordance is defined
as the condition that $(X_1 - X_2)(Y_1 - Y_2) > 0$ and discordance
is defined as the condition that $(X_1 - X_2)(Y_1 - Y_2) < 0$.
Kendall's $\tau$ is defined to be the difference between the
probabilities of concordance and discordance:

$$\tau = P[(X_1 - X_2)(Y_1 - Y_2) > 0] - P[(X_1 - X_2)(Y_1 - Y_2) < 0].$$

Nelsen has proved the relationship in (4) for a copula, $C$, and
random variables $X \sim f_X(x)$, $Y \sim f_Y(y)$ [15, p. 161], i.e.,

$$\tau(\phi) = 4\,\mathrm{E}\left[ C(F_X(x), F_Y(y)) \right] - 1. \tag{4}$$

This relationship allows $\tau$ to be expressed in terms of the
dependence parameter of the copula, $C$ ($\Sigma$ for the elliptical
copulas and $\phi$ for the Archimedean copulas in Table I). For
the case of elliptical copulas, parametrized by the matrix
$\Sigma = [\rho_{ij}]$,

$$\rho_{ij} = \sin\left( \frac{\pi \tau_{ij}}{2} \right), \tag{5}$$

where $\tau_{ij}$ is the Kendall's $\tau$ evaluated for the pair
$(U_i, U_j) = (F_{X_i}(\cdot), F_{X_j}(\cdot))$. The sample estimate of Kendall's $\tau$, for $N$
observations, can be calculated as the ratio of the difference
in the number of concordant pairs, $c$, and discordant pairs, $d$,
to the total number of pairs of observations, i.e.,

$$\hat{\tau} = \frac{c - d}{c + d} = \frac{c - d}{\binom{N}{2}}. \tag{6}$$

Typically, the value of the dependence parameter is not known
a priori, and $\phi$ needs to be estimated, e.g., using maximum
likelihood estimation (MLE). On the other hand, (6) and (4)
imply that Kendall's $\tau$ can be used for calculating
computationally efficient estimates of $\phi$.
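The rank-based route from Kendall's $\tau$ to a copula parameter can be illustrated with a short sketch. It counts concordant and discordant pairs as in (6) and then inverts the $\tau$-parameter relationship. Only the elliptical relation (5) is taken from this paper; the Clayton and Gumbel inversions ($\tau = \phi/(\phi+2)$ and $\tau = 1 - 1/\phi$) are standard closed forms added here for illustration, and all function names are hypothetical.

```python
import math
from itertools import combinations


def kendall_tau(x, y):
    """Sample Kendall's tau: (concordant - discordant) / total number of pairs.

    Assumes no ties, so that c + d equals N choose 2 as in (6).
    """
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        s = (x1 - x2) * (y1 - y2)
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    n = len(x)
    return (concordant - discordant) / (n * (n - 1) / 2)


def gaussian_rho_from_tau(tau):
    """Relation (5) for elliptical copulas: rho = sin(pi * tau / 2)."""
    return math.sin(math.pi * tau / 2.0)


def clayton_phi_from_tau(tau):
    """Standard Clayton relation tau = phi / (phi + 2), inverted (not from the paper)."""
    return 2.0 * tau / (1.0 - tau)


def gumbel_phi_from_tau(tau):
    """Standard Gumbel relation tau = 1 - 1 / phi, inverted (not from the paper)."""
    return 1.0 / (1.0 - tau)


# Tiny worked example: 5 concordant pairs and 1 discordant pair out of 6.
tau_hat = kendall_tau([1, 2, 3, 4], [1, 3, 2, 4])
rho_hat = gaussian_rho_from_tau(tau_hat)
```

The pair count is quadratic in $N$, which is acceptable for a sketch; a production implementation would typically use an $O(N \log N)$ algorithm.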


III.  Problem  formulation 

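TABLE I
Some copula functions

| Family | Copula | Parametric form | Parameter range |
|---|---|---|---|
| Elliptical | Gaussian | $\Phi_\Sigma(\Phi^{-1}(u_1), \ldots, \Phi^{-1}(u_m))$, where $\Phi_\Sigma(\mathbf{x}) = \int_{-\infty}^{\mathbf{x}} \mathcal{N}(\mathbf{x}; 0, \Sigma)\, d\mathbf{x}$ and $\Phi^{-1}(u) = \inf_{x \in \mathbb{R}} \{ u \le \int_{-\infty}^{x} \mathcal{N}(x; 0, 1)\, dx \}$ | $\Sigma = [\rho_{ij}]$, $i, j = 1, \ldots, m$, $\rho_{ij} \in [-1, 1]$ |
| Elliptical | Student-t | $t_{\nu, \Sigma}(t_\nu^{-1}(u_1), \ldots, t_\nu^{-1}(u_m))$, where $t_{\nu, \Sigma}$ is the multivariate Student-t CDF and $t_\nu^{-1}$ is the inverse CDF of the univariate Student-t | $\Sigma$ as above; $\nu$: degrees of freedom, $\nu \ge 3$ |
| Archimedean | Clayton | $\left( \sum_{i=1}^{m} u_i^{-\phi} - 1 \right)^{-1/\phi}$ | $\phi \in [-1, \infty) \setminus \{0\}$ |
| Archimedean | Frank | $-\frac{1}{\phi} \log \left( 1 + \frac{\prod_{i=1}^{m} [\exp\{-\phi u_i\} - 1]}{\exp\{-\phi\} - 1} \right)$ | $\phi \in \mathbb{R} \setminus \{0\}$ |
| Archimedean | Gumbel | $\exp\left\{ -\left( \sum_{i=1}^{m} (-\ln u_i)^{\phi} \right)^{1/\phi} \right\}$ | $\phi \in [1, \infty)$ |
| Archimedean | Independent | $\prod_{i=1}^{m} u_i$ | (none) |

Consider a scene or phenomenon being monitored by
a sensor suite consisting of $L$ sensors. The $i$-th sensor,
$i = 1, 2, \ldots, L$, makes a set of $N$ measurements, $x_{ij}$,
$j = 1, \ldots, N$. These measurements may represent a time series
(with $j$ being the time index), spectral coefficients (with $j$
being the frequency index), or some other feature vector.
The vector $\mathbf{x}_j$ denotes the $j$-th measurements at all sensors,
i.e., $\mathbf{x}_j = [x_{1j}, x_{2j}, \ldots, x_{Lj}]^T$. The collective measurements,
$\mathbf{x} = [\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_j, \ldots, \mathbf{x}_N]$, are received at a processing unit
or fusion center (FC). Based on the joint characteristics of $\mathbf{x}$,
the FC decides whether a phenomenon is present or absent in
the region of interest and, thus, solves the following hypothesis
testing problem:

$$
\begin{aligned}
H_0 &: f_0(\mathbf{x}) = \prod_{j=1}^{N} f_0(\mathbf{x}_j) \\
H_1 &: f_1(\mathbf{x}) = \prod_{j=1}^{N} f_1(\mathbf{x}_j),
\end{aligned} \tag{7}
$$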

where $H_0$ is the null hypothesis that the background process
is observed, and $H_1$ is the alternative, i.e., the phenomenon of
interest is observed. The pdfs under the null and alternative
hypotheses are, respectively, denoted as $f_0$ and $f_1$. In taking
the product over all $j$ in (7), we assume that for a given
sensor, signals are independent over the index $j$, e.g., over
time. However, in general,

$$f_k(\mathbf{x}_j) \neq \prod_{i=1}^{L} f_k(x_{ij}), \quad k = 0, 1.$$

This formulation, therefore, asserts that since the sensors
are observing the same phenomenon, at any given instant,
sensor measurements need not be independent spatially (across
sensors).

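Using Sklar's theorem (Section II-B, Theorem 1), the
joint densities in (7) can be expressed in terms of the copula
densities, $c_0$ and $c_1$, respectively under $H_0$ and $H_1$, as

$$
\begin{aligned}
H_0 &: f_0(\mathbf{x}) = \prod_{j=1}^{N} \left( \prod_{i=1}^{L} f_0(x_{ij} | \theta_{0i}) \right) c_0(u^0_{1j}(\theta_{01}), \ldots, u^0_{Lj}(\theta_{0L}) | \phi_0) \\
H_1 &: f_1(\mathbf{x}) = \prod_{j=1}^{N} \left( \prod_{i=1}^{L} f_1(x_{ij} | \theta_{1i}) \right) c_1(u^1_{1j}(\theta_{11}), \ldots, u^1_{Lj}(\theta_{1L}) | \phi_1).
\end{aligned} \tag{8}
$$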

The copula arguments are the probability integral transforms
(PIT) of $x_{ij}$ under hypothesis $H_k$, i.e., for sensor $i$ and
measurement $j$,

$$u^k_{ij}(\theta_{ki}) = F_k(x_{ij} | \theta_{ki}), \quad k = 0, 1. \tag{9}$$

The quantities $\{\theta_0, \theta_1\}$ and $\{\phi_0, \phi_1\}$ in (8) are, respectively,
the marginal density parameters and copula parameters under
$\{H_0, H_1\}$. When these parameters are known, the likelihood
ratio test (LRT) is the optimal test. Equivalently, we can
compare the log-likelihood ratio (LLR) to a threshold $\eta$,

$$T_{LR}(\mathbf{x}) \underset{H_0}{\overset{H_1}{\gtrless}} \eta, \tag{10}$$

where,

$$
\begin{aligned}
T_{LR}(\mathbf{x}) &= \log \frac{f_1(\mathbf{x})}{f_0(\mathbf{x})} \\
&= \sum_{j=1}^{N} \sum_{i=1}^{L} \log \frac{f_1(x_{ij} | \theta_{1i})}{f_0(x_{ij} | \theta_{0i})}
+ \sum_{j=1}^{N} \log \frac{c_1(u^1_{1j}(\theta_{11}), \ldots, u^1_{Lj}(\theta_{1L}) | \phi_1)}{c_0(u^0_{1j}(\theta_{01}), \ldots, u^0_{Lj}(\theta_{0L}) | \phi_0)}.
\end{aligned} \tag{11}
$$

These parameters are typically unknown and have to be
estimated. Using maximum likelihood (ML) estimates in place
of the true parameter values, the test becomes a generalized
likelihood ratio test (GLRT) in the Neyman-Pearson
framework. From (8) and (9), it is seen that the copula density is
also a function of the marginal parameter, $\theta_{ki}$, through the
PIT. Thus, ideally, ML estimation of the parameters would
require simultaneous maximization of the joint likelihood
function over both the marginal and copula parameters. This
is, however, difficult, and a consistent two-step estimation
procedure is commonly used in the copula literature [16]. The
two-step maximum likelihood (TSML) procedure first maximizes
the individual marginal likelihoods over each $\theta_{ki}$:

$$\hat{\theta}_{ki} = \arg\max_{\theta_{ki}} \sum_{j=1}^{N} \log f_k(x_{ij} | \theta_{ki}). \tag{12}$$
The second step of TSML substitutes $\theta_{ki} = \hat{\theta}_{ki}$ in (9); the
copula likelihood, thus obtained, is then maximized over $\phi_k$,
i.e.,

$$\hat{u}^k_{ij} = u^k_{ij}(\hat{\theta}_{ki}) \tag{13}$$

$$\hat{\phi}_k = \arg\max_{\phi_k} \sum_{j=1}^{N} \log c_k(\hat{u}^k_{1j}, \ldots, \hat{u}^k_{Lj} | \phi_k). \tag{14}$$

The GLRT then can be expressed as,

$$T_{GLR}(\mathbf{x}) \underset{H_0}{\overset{H_1}{\gtrless}} \eta, \tag{15}$$

where,

$$
T_{GLR}(\mathbf{x}) = \sum_{j=1}^{N} \sum_{i=1}^{L} \log \frac{f_1(x_{ij} | \hat{\theta}_{1i})}{f_0(x_{ij} | \hat{\theta}_{0i})}
+ \sum_{j=1}^{N} \log \frac{c_1(u^1_{1j}(\hat{\theta}_{11}), \ldots, u^1_{Lj}(\hat{\theta}_{1L}) | \hat{\phi}_1)}{c_0(u^0_{1j}(\hat{\theta}_{01}), \ldots, u^0_{Lj}(\hat{\theta}_{0L}) | \hat{\phi}_0)}. \tag{16}
$$

Alternatively, for the bivariate case ($L = 2$), we can also use
the sample estimate of Kendall's $\tau$, defined in (6), to estimate
$\phi_k$. Noting that the relation in (4) is invertible, we rewrite the
functional relationship between $\tau$ and $\phi_k$ in terms of a
function $g_k$ so that,

$$\tau = g_k(\phi_k) \tag{17}$$

$$\Rightarrow \phi_k = g_k^{-1}(\tau). \tag{18}$$

The resultant estimate of $\phi_k$ is given by

$$\tilde{\phi}_k = g_k^{-1}(\hat{\tau}). \tag{19}$$

Further, since $\hat{\tau}$ is a consistent estimator of $\tau$ [17],
$\tilde{\phi}_k \to \phi_k$ as $N \to \infty$. For finite $N$, using $\tilde{\phi}_k$ instead of $\hat{\phi}_k$ in
(16) results in a sub-optimal test, but a simpler estimation
procedure.

IV. Detection under non-stationary dependence

In Section I, we motivated the need to consider
non-stationary dependence. While the preceding section assumes
that the families of the copulas, $c_0$ and $c_1$, are known, a formulation
with non-stationary dependence has to necessarily drop that
assumption. In the following discussion, we assume that the
background model can be predetermined to some degree: the
family of the marginals is known and $c_0$ is known. The more
general case of unknown $c_0$ is considered by Iyengar et al. [1],
but signal detection for such a scheme would need to be
implemented under a training-testing paradigm. However,
non-stationarity notwithstanding, the true underlying copula under
$H_1$, $c$, is typically not known; this "true copula" is usually
abstracted as a single copula, but it may, in fact, be a composite
of several copulas interacting in an indeterminate fashion,
accounting for the non-stationary nature of observations. Due
to these complexities, copula selection is an important part of
copula-based inference and several copula selection methods
have been proposed [1], [7], [13]. Our assumptions are stated
more precisely as follows:

1) We assume that $f_0$, the marginal density families under
$H_0$, are known for each $i = 1, \ldots, L$. The corresponding
marginal parameters, $\theta_{0i}$, may be unknown.

2) The $H_0$ copula family, $c_0$, is known but $\phi_0$ may be
estimated, if needed. This section, however, assumes,
without loss of generality, that $c_0 = 1$, i.e.,
measurements under $H_0$ are independent across sensors.
However, the discussion is valid for any known $c_0$. The
independence under the null hypothesis also allows us
to simplify our notation; we do not explicitly notate for
$H_1$ with respect to the copula function. Therefore, we
set

$$c_1(\cdot) = c(\cdot), \qquad u^1_{ij}(\theta_{1i}) = u_{ij}, \qquad \phi_1 = \phi.$$

3) The copula under the alternative, $c_1$, is not known a
priori. The "best" copula, in the sense of maximum
likelihood, is selected from a predefined library of
copulas, $\mathcal{C} = \{c_m : m = 1, \ldots, M\}$.

Based on these assumptions, we discuss three detection
scenarios: detection with known parameters, detection with
unknown parameters, and detection with unknown marginals
under $H_1$ and unknown copula parameters.
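As a concrete illustration of the two-step maximum likelihood (TSML) procedure in (12)-(14), on which the unknown-parameter detectors rely, the following minimal sketch fits two Gaussian marginals and then a bivariate Gaussian copula. The choice of Gaussian marginals, the Gaussian copula, the grid search over $\rho$, and every function name are assumptions made for illustration; the paper does not prescribe a particular optimizer or marginal family.

```python
import math
import random
from statistics import NormalDist, fmean, pstdev


def fit_gaussian_marginal(x):
    """Step 1, as in (12): ML estimates (mean, std) of a Gaussian marginal."""
    return fmean(x), pstdev(x)


def gaussian_copula_loglik(z1, z2, rho):
    """Log-likelihood of a bivariate Gaussian copula at normal scores z = Phi^{-1}(u)."""
    q = 1.0 - rho * rho
    return sum(
        -0.5 * math.log(q) - (rho * rho * (a * a + b * b) - 2.0 * rho * a * b) / (2.0 * q)
        for a, b in zip(z1, z2)
    )


def tsml_gaussian(x1, x2, grid_size=199):
    """Two-step ML: fit marginals, apply the PIT (13), then maximize the copula likelihood (14)."""
    std_normal = NormalDist()
    scores = []
    for x in (x1, x2):
        mu, sigma = fit_gaussian_marginal(x)
        marginal = NormalDist(mu, sigma)
        # PIT with the estimated marginal parameters, clipped away from 0 and 1.
        u = [min(max(marginal.cdf(v), 1e-12), 1.0 - 1e-12) for v in x]
        scores.append([std_normal.inv_cdf(p) for p in u])
    # Step 2: coarse grid search over rho in [-0.99, 0.99] (an illustrative optimizer).
    grid = [-0.99 + 1.98 * k / (grid_size - 1) for k in range(grid_size)]
    return max(grid, key=lambda r: gaussian_copula_loglik(scores[0], scores[1], r))


# Synthetic two-sensor data with a true copula correlation of 0.6.
random.seed(0)
rho_true = 0.6
noise = [(random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)) for _ in range(2000)]
x1 = [2.0 + 1.5 * a for a, _ in noise]
x2 = [-1.0 + 0.5 * (rho_true * a + math.sqrt(1.0 - rho_true ** 2) * b) for a, b in noise]
rho_hat = tsml_gaussian(x1, x2)
```

The grid search keeps the sketch dependency-free; a numerical optimizer over $\phi_k$ would normally replace it.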


A.  Detection  with  known  parameters 


For some applications, it may be feasible to determine, a
priori, the value of the copula parameter $\phi_m$ for each $c_m \in \mathcal{C}$.
The actual selection of the copula may be done online. For
this case the test statistic is formulated as a modification of
(11),

$$
\begin{aligned}
T_{LR}(\mathbf{x}) &= \log \frac{f_1(\mathbf{x})}{f_0(\mathbf{x})} \\
&= \sum_{j=1}^{N} \sum_{i=1}^{L} \log \frac{f_1(x_{ij} | \theta_{1i})}{f_0(x_{ij} | \theta_{0i})}
+ \sum_{j=1}^{N} \log c_j^*(u_{1j}, \ldots, u_{Lj} | \phi_j^*),
\end{aligned} \tag{20}
$$

where for each $j$ the maximum copula likelihood is selected
from the library, i.e.,

$$c_j^* = \max_{c_m \in \mathcal{C}} \mathcal{C} \tag{21}$$

$$\phi_j^* = \arg c_j^*. \tag{22}$$

The key difference here is that previous papers have proposed
selecting a single copula for the entire observation window [1],
[13], i.e., choose

$$c_N^* = \max_{c_m,\ \forall j} \mathcal{C} \tag{23}$$

$$\phi_N^* = \arg c_N^*. \tag{24}$$

On the other hand, we select the best copula for each $j$,
adapting to a potentially changing dependence structure. Denote
the Kullback-Leibler (KL) divergence between the pdfs $g$ and
$h$ as $D(g \| h)$. We now prove that selecting the best copula for
each $j$, as opposed to a single best copula for all $N$, leads to
a smaller KL divergence from the single true copula, $c$.


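Proposition 1: Let $\mathbf{X} \sim f_{\mathbf{X}}(\mathbf{x})$, $\mathbf{X} \in \mathbb{R}^{N \times L}$, where,

$$f_{\mathbf{X}}(\mathbf{x}) = \prod_{j=1}^{N} \left( \prod_{i=1}^{L} f_{X_i}(x_{ij} | \theta_{1i}) \right) c(u_{1j}, \ldots, u_{Lj} | \phi), \tag{25}$$

where $c$ is the true copula. For the copula library,

$$\mathcal{C} = \{c_m : m = 1, \ldots, M\}, \tag{26}$$

and selection schemes (21) and (23),

$$D(f_{\mathbf{X}} \| f_{c_j^*}) \le D(f_{\mathbf{X}} \| f_{c_N^*}), \tag{27}$$

where $f_{c_j^*}$ and $f_{c_N^*}$ are the joint densities for $\mathbf{X}$ under $H_1$
using (21) and (23) respectively.

Proof: Consider the case $M = 2$. Choosing $c_1$ over $c_2$
when $c_1 > c_2$ is equivalent to the decision rule when copula
selection is posed as a decision problem with equally likely
copulas. Let $\Omega$ represent the sample space. Let $\Omega_m \subset \Omega$
represent the decision region for $\mathbf{x}$ for which $c_m$ ($m = 1, 2$)
is chosen, so that $\Omega_1 \cup \Omega_2 = \Omega$ and $\Omega_1 \cap \Omega_2 = \emptyset$. Denote the
product of marginals as $f_p(\mathbf{x}_j)$, i.e.,

$$f_p(\mathbf{x}_j) = \prod_{i=1}^{L} f_{X_i}(x_{ij} | \theta_{1i}).$$

Also, define the following sets:

$$J_1 = \{j : \mathbf{x}_j \in \Omega_1\} \quad \text{and} \quad J_2 = \{j : \mathbf{x}_j \in \Omega_2\}.$$

Then,

$$
\begin{aligned}
D(f_{\mathbf{X}} \| f_{c_j^*}) &= \int \log \frac{f_{\mathbf{X}}}{f_{c_j^*}}\, dF_{\mathbf{X}} \\
&= \int \log \frac{\prod_{j=1}^{N} f_p(\mathbf{x}_j)\, c(\cdot | \phi)}{\prod_{j=1}^{N} f_p(\mathbf{x}_j) \prod_{J_1} c_1(\cdot | \phi_1) \prod_{J_2} c_2(\cdot | \phi_2)}\, dF_{\mathbf{X}} \\
&= \int \sum_{j=1}^{N} \log c(\cdot)\, dF_{\mathbf{X}} - \int \left[ \sum_{J_1} \log c_1(\cdot) + \sum_{J_2} \log c_2(\cdot) \right] dF_{\mathbf{X}}.
\end{aligned} \tag{28}
$$

The selection criterion in (21) implies that, for the set $J_2$,
$c_2 \ge c_1$. Therefore,

$$\sum_{J_1} \log c_1(\cdot) + \sum_{J_2} \log c_2(\cdot) \ge \sum_{J_1} \log c_1(\cdot) + \sum_{J_2} \log c_1(\cdot) = \sum_{j=1}^{N} \log c_1(\cdot), \tag{29}$$

and in a similar manner,

$$\sum_{J_1} \log c_1(\cdot) + \sum_{J_2} \log c_2(\cdot) \ge \sum_{j=1}^{N} \log c_2(\cdot). \tag{30}$$

Therefore, depending on whether $c_1$ or $c_2$ was chosen using
(23), we can substitute either of the inequalities in (29) or (30)
into (28) to get,

$$D(f_{\mathbf{X}} \| f_{c_j^*}) \le D(f_{\mathbf{X}} \| f_{c_N^*}).$$

This proves (27) for $M = 2$. For $M > 2$ we can successively
partition $\Omega_2$ and arrive at a similar result by repeating the
above steps. ■

Proposition 1 implies that a detector using the proposed
selection scheme in (21) will suffer a lower loss in detection
performance due to copula misspecification [1].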
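The inequality at the heart of the proof, namely that the sum of per-sample maxima in (21) is never smaller than the best single-copula sum in (23), can be checked numerically. The sketch below assumes a two-member library (a bivariate Gaussian copula with $\rho = 0.3$ and a bivariate Clayton copula with $\phi = 0.5$); these members, their fixed parameter values, the uniformly drawn test points, and the function names are illustrative assumptions only.

```python
import math
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def log_gaussian_copula(u, v, rho):
    """Log density of the bivariate Gaussian copula."""
    a, b = _STD_NORMAL.inv_cdf(u), _STD_NORMAL.inv_cdf(v)
    q = 1.0 - rho * rho
    return -0.5 * math.log(q) - (rho * rho * (a * a + b * b) - 2.0 * rho * a * b) / (2.0 * q)


def log_clayton_copula(u, v, phi):
    """Log density of the bivariate Clayton copula (phi > 0)."""
    s = u ** (-phi) + v ** (-phi) - 1.0
    return (math.log1p(phi)
            - (1.0 + phi) * (math.log(u) + math.log(v))
            - (2.0 + 1.0 / phi) * math.log(s))


def selection_scores(samples, library):
    """Return (sample-wise score, single-copula score) of the copula log-likelihood.

    library is a list of functions (u, v) -> log copula density.
    """
    per_sample = [[log_c(u, v) for log_c in library] for u, v in samples]
    samplewise = sum(max(row) for row in per_sample)              # scheme (21)
    single = max(sum(row[m] for row in per_sample)                # scheme (23)
                 for m in range(len(library)))
    return samplewise, single


random.seed(3)
samples = [(random.uniform(0.01, 0.99), random.uniform(0.01, 0.99)) for _ in range(500)]
library = [
    lambda u, v: log_gaussian_copula(u, v, 0.3),
    lambda u, v: log_clayton_copula(u, v, 0.5),
]
samplewise_score, single_score = selection_scores(samples, library)
```

Because the maximum is taken term by term, `samplewise_score` is at least `single_score` for any data set, which mirrors (29) and (30).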


B.  Detection  with  unknown  parameters 

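With unknown parameters, the statistic in (16) for the
composite hypothesis testing problem can be rewritten as,

$$
T_{GLR}(\mathbf{x}) = \sum_{j=1}^{N} \sum_{i=1}^{L} \log \frac{f_1(x_{ij} | \hat{\theta}_{1i})}{f_0(x_{ij} | \hat{\theta}_{0i})}
+ \sum_{j=1}^{N} \log c_j^*(u_{1j}(\hat{\theta}_{11}), \ldots, u_{Lj}(\hat{\theta}_{1L}) | \hat{\phi}_j^*), \tag{31}
$$

where the TSML procedure has been used to obtain estimates
of marginal and copula parameters. The copula parameters $\phi_m$
are estimated, over all $N$, for each $c_m \in \mathcal{C}$, so that

$$\mathcal{C} = \{c_m(\hat{\phi}_m(N)) : m = 1, \ldots, M\} \tag{32}$$

$$c_j^* = \max_{c_m \in \mathcal{C}} \mathcal{C} \tag{33}$$

$$\hat{\phi}_j^* = \arg c_j^*. \tag{34}$$

While this selection method is motivated by the implications
of Proposition 1 for the simple hypothesis case, a similar result
may not be stated for the composite test. This is because ML
estimation requires that all $N$ samples be drawn from the same
population; this need not hold true for copula selection from
$\mathcal{C}$ with unknown parameters.

The copula parameters can also be estimated using $\hat{\tau}$. The
test statistic is then,

$$
T_{\hat{\tau}}(\mathbf{x}) = \sum_{j=1}^{N} \sum_{i=1}^{L} \log \frac{f_1(x_{ij} | \hat{\theta}_{1i})}{f_0(x_{ij} | \hat{\theta}_{0i})}
+ \sum_{j=1}^{N} \log c_j^*(u_{1j}(\hat{\theta}_{11}), \ldots, u_{Lj}(\hat{\theta}_{1L}) | \tilde{\phi}_j^*), \tag{35}
$$

where $\tilde{\phi}_j^*$ is the estimate of $\phi_j^*$ based on $\hat{\tau}$. Correspondingly,

$$\mathcal{C} = \{c_m(\tilde{\phi}_m) : m = 1, \ldots, M\} \tag{36}$$

$$c_j^* = \max_{c_m \in \mathcal{C}} \mathcal{C} \tag{37}$$

$$\tilde{\phi}_j^* = \arg c_j^*. \tag{38}$$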


C.  Detection  with  unknown  marginals  and  unknown  copula 
parameters 

In many applications, establishing a model under $H_1$ is
not feasible. In that case, $f_1$ is determined non-parametrically
and $u_{ij}$ is obtained using the empirical probability integral
transform (EPIT). The test statistic is, therefore, expressed as,

$$
T_{NPM}(\mathbf{x}) = \sum_{j=1}^{N} \sum_{i=1}^{L} \log \frac{\hat{f}_1(x_{ij})}{f_0(x_{ij} | \hat{\theta}_{0i})}
+ \sum_{j=1}^{N} \log c_j^*(u_{1j}, \ldots, u_{Lj} | \tilde{\phi}_j^*), \tag{39}
$$

where $c_j^*$ and associated parameters are selected as indicated
in (36), (37) and (38). The uniform random variables in the
copula density are evaluated using EPIT,

$$\hat{F}_i(\cdot) = \frac{1}{N} \sum_{j=1}^{N} I(x_{ij} \le \cdot) \tag{40}$$

$$u_{ij} = \hat{F}_i(x_{ij}), \tag{41}$$

where $I$ is the indicator function.

The marginal model under $H_1$ is determined through a
kernel density estimation procedure. Kernel density
estimators [18] provide a smoothed estimate, $\hat{f}_1(x_{ij})$, of the true
density. Choosing the correct bandwidth for kernel density
estimation is important for an accurate estimate. The
kernel bandwidth is chosen using leave-one-out cross-validation.
The selected bandwidth, $h^*$, is the minimizer of the
cross-validation estimator of risk, $\hat{J}$, for a kernel, $K$. The risk
estimator may be easily computed using the approximation [18,
p. 136],

$$\hat{J}(h) = \frac{1}{h N^2} \sum_{p} \sum_{q} K^*\left( \frac{x_p - x_q}{h} \right) + \frac{2}{N h} K(0) + O\left( \frac{1}{N^2} \right), \tag{42}$$

where,

$$K^*(x) = K^{(2)}(x) - 2K(x)$$

$$K^{(2)}(z) = \int K(z - y) K(y)\, dy.$$

The Gaussian kernel was selected, so that $K(x) = \mathcal{N}(x; 0, 1)$
and $K^{(2)}(z) = \mathcal{N}(z; 0, 2)$. Therefore,

$$h^* = \arg\min_{h} \hat{J}(h).$$
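The bandwidth selection step can be sketched as a direct evaluation of the cross-validation risk approximation for a Gaussian kernel, followed by a grid search for its minimizer. The grid of candidate bandwidths, the synthetic data, and all function names are assumptions made for illustration; the double sum is quadratic in the number of samples and is written for clarity rather than speed.

```python
import math
import random


def normal_pdf(x, var):
    """Zero-mean Gaussian density with variance var."""
    return math.exp(-x * x / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def k_star(x):
    """K*(x) = K^(2)(x) - 2 K(x) for the Gaussian kernel, where K^(2) = N(0, 2)."""
    return normal_pdf(x, 2.0) - 2.0 * normal_pdf(x, 1.0)


def cv_risk(data, h):
    """Approximate leave-one-out cross-validation risk, dropping the O(1/N^2) term."""
    n = len(data)
    total = sum(k_star((a - b) / h) for a in data for b in data)
    return total / (h * n * n) + 2.0 * normal_pdf(0.0, 1.0) / (n * h)


def select_bandwidth(data, grid):
    """Return the candidate bandwidth with the smallest estimated risk."""
    return min(grid, key=lambda h: cv_risk(data, h))


random.seed(1)
data = [random.gauss(0.0, 1.0) for _ in range(100)]
grid = [0.05 * k for k in range(1, 31)]      # candidates 0.05, 0.10, ..., 1.50
h_star = select_bandwidth(data, grid)
```

A finer grid or a scalar optimizer could replace the coarse grid once a plausible range for $h$ is known.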


V.  Results  and  discussion 


In this section, we present results when the copula-based
tests, discussed in Section IV, are applied to simulated and
real data. Our results are presented for a two-sensor case, i.e.,
$L = 2$. We note, however, that the methods described above
apply generally to $L > 2$, as one can construct a multivariate
copula using bivariate components [13].

A. Simulated data

We simulated normal and beta distributed marginals and
considered various cases of copula dependence. The marginals
and the respective parameters used are tabulated in Table II.
For all copula cases we used Kendall's $\tau = 0.2$ to specify
dependence. The copula library contains the Gaussian and
Frank copulas. For all the cases, we compare performances
obtained when testing with $T_{GLR}$ in (31), $T_{\hat{\tau}}$ in (35), GLR
using single copula selection and the product rule, i.e.,
independence assumption. The results presented are averaged over
$10^4$ Monte-Carlo trials with 1000 samples per trial.

TABLE II
Distribution of marginals for simulation experiments

| $i$ | $H_0$ | $H_1$ |
|---|---|---|
| 1 | $\mathcal{N}(0, 1)$ | $\mathcal{N}(0.1, 1.1)$ |
| 2 | Beta(2.0, 2.0) | Beta(2.2, 2.2) |

In Figs. 1, 2 and 3, receiver operating characteristics (ROC)
for different generating copulas are shown. In Fig. 1, we
consider the case in which a t copula is used to generate the
data. Note that this represents the case where the true copula
is not known and is not included in the copula library. The
label non-stationary copula refers to the sample-wise selection
scheme proposed in this paper. Fig. 2 represents the case where
half of all $\mathbf{x}_j$ were simulated with a Gaussian copula and
the remaining half consisted of samples generated from the
Frank copula. This is, therefore, the case where the copula
library also accommodates the generating model. The case of
a single generating copula that is also a member of the library
is also considered; Fig. 3 shows results when all the data are
generated using the Frank copula.

For all simulation scenarios we observe that the GLRT
and the test based on $\hat{\tau}$, using our selection scheme, perform
comparably. Both outperform the single copula selection and
product rules. We note that these results represent the
unknown parameter case (Section IV-B), for which we were
not able to prove that our method would outperform the
single copula selection method. An intuition for why we
observe this result is that since $\hat{\tau}$ is consistent, for large
$N$, $\hat{\tau} \to \tau$. Also, one value of $\tau$ corresponds to different
values of $\phi_m = \arg c_m(\cdot)$, $c_m \in \mathcal{C}$. We conjecture that,
asymptotically, this is as if the parameter values are known,
allowing Proposition 1 to be applicable. This implies that
while $\tau$ controls the amount of dependence, which remains
unchanged for all $N$, different copulas represent the shape
of the dependence between the data from the two sensors.
Verifying this conjecture mathematically is difficult, and will
be addressed in our future work.
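To make the role of Kendall's $\tau = 0.2$ in the simulation setup concrete, the following sketch draws pairs from a bivariate Gaussian copula whose correlation is set through relation (5), and then checks that the sample $\tau$ is close to the target. It covers only the Gaussian copula and only the normal $H_1$ marginal of Table II; treating 1.1 as a variance is an assumption, the beta marginal is omitted because its inverse CDF is not in the standard library, and the function names are hypothetical.

```python
import math
import random
from itertools import combinations
from statistics import NormalDist


def sample_gaussian_copula(n, tau, rng):
    """Draw n pairs (u1, u2) from a Gaussian copula with Kendall's tau equal to tau."""
    rho = math.sin(math.pi * tau / 2.0)          # relation (5)
    std_normal = NormalDist()
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        pairs.append((std_normal.cdf(z1), std_normal.cdf(z2)))
    return pairs


def kendall_tau(pairs):
    """Sample Kendall's tau from concordant and discordant pair counts."""
    score = 0
    for (a1, b1), (a2, b2) in combinations(pairs, 2):
        s = (a1 - a2) * (b1 - b2)
        score += (s > 0) - (s < 0)
    n = len(pairs)
    return score / (n * (n - 1) / 2)


rng = random.Random(7)
uniform_pairs = sample_gaussian_copula(800, 0.2, rng)
tau_hat = kendall_tau(uniform_pairs)

# Map the first coordinate to the H1 normal marginal of Table II
# (assumption: the second parameter, 1.1, is a variance).
h1_marginal = NormalDist(0.1, math.sqrt(1.1))
x1 = [h1_marginal.inv_cdf(u1) for u1, _ in uniform_pairs]
```

Because Kendall's $\tau$ is invariant under monotone transforms, the marginal mapping in the last step does not change the dependence measured by `tau_hat`.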


B.  ARL  footstep  data 

We  used  the  footstep  data,  made  available  by  the  US  Army 
Research  Laboratory  (ARL),  collected  at  the  US  southwest 
border.  The  dataset  consists  of  raw  observations  from  several 
sensors  of  different  modalities  that  were  deployed  in  an 
outdoor  space  to  record  human  and  animal  activity  that  is 
typical  in  perimeter  and  border  surveillance  scenarios.  The 
participants  in  the  data  collection  exercise  walked/ran  along 
a  predetermined  path  with  sensors  laid  out  along  either  side 
of  the  path.  In  this  paper,  we  consider  copula-based  seismic- 
acoustic  fusion. 

Fig. 1. ROCs for $H_1$ data generated using a t copula.
Fig. 2. ROCs for $H_1$ data generated using Frank and Gaussian copulas.
Fig. 3. ROCs for $H_1$ data generated using a Frank copula.

Seismic and acoustic time series for activities representing
a single person walking, two persons walking and a human
leading an animal (among other examples) are available in
the ARL dataset. Each seismic/acoustic time series contains a
leading 60 s of background data. We use this as our $H_0$ data.
The data are sampled at 10 kHz, and are mean centered and
oscillatory in nature.

Before applying the copula-based detector, we first
pre-process the data. The time series is split into non-overlapping
frames of length $T = 512$. This raw time series data is called
$x_{Ti}(t)$, where $i = 1, 2$ is the sensor index for the acoustic
and seismic modalities respectively, and $t$ is the time index.
In keeping with Houston's analysis that Fourier spectra for
seismic and acoustic footstep data are more informative than
time-domain measurements [12], we set

$$x_{ij} = \left| \mathcal{F}\{x_{Ti}(t)\} \right|^2,$$

where $\mathcal{F}$ is the DFT and $j = 1, \ldots, 256$ is the
frequency index. Our sensor measurements are, therefore, now
transformed to the frequency domain and the statistics of
$\mathbf{x} = [x_{ij}]$ are used as the input to the detector. The copula
library consists of Gaussian, Gumbel and Frank copulas. We
have observed that due to the interstitial nature of footstep
data, including the independence copula (Table I) in the library
improves the overall detection performance.
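The pre-processing described above can be sketched as framing followed by a squared-magnitude DFT. The sketch assumes that the 256 retained bins are $k = 1, \ldots, 256$ (dropping the DC bin, which is near zero for mean-centered data); the paper does not state which bins are kept. A direct DFT is used to stay dependency-free, whereas an FFT routine such as `numpy.fft.rfft` would be the usual choice. The function names are hypothetical.

```python
import cmath
import math


def split_frames(signal, frame_len=512):
    """Split a time series into non-overlapping frames, discarding any remainder."""
    n_frames = len(signal) // frame_len
    return [signal[k * frame_len:(k + 1) * frame_len] for k in range(n_frames)]


def power_spectrum(frame, n_bins=256):
    """Squared DFT magnitude for bins k = 1..n_bins (assumed bin convention)."""
    frame_len = len(frame)
    spectrum = []
    for k in range(1, n_bins + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / frame_len)
                    for t, x in enumerate(frame))
        spectrum.append(abs(coeff) ** 2)
    return spectrum


# Synthetic check: a pure tone at DFT bin 8 spanning two frames.
signal = [math.cos(2.0 * math.pi * 8 * t / 512) for t in range(1024)]
features = [power_spectrum(frame) for frame in split_frames(signal)]
peak_index = max(range(256), key=lambda j: features[0][j])
```

Here `features[n][j - 1]` plays the role of $x_{ij}$ for one sensor and frame $n$, and the tone at bin 8 appears at list index 7.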

For the ARL dataset, we use the statistic $T_{NPM}(\mathbf{x})$ in
(39). To generate ROCs, we compare the test statistic to a
vector of thresholds. The curve thus generated, for the case
when $H_1$ corresponds to one person walking, is shown in
Fig. 4. This curve is compared to the ROCs for the single copula
selection scheme as well as the product rule, i.e., independence
assumption for $H_1$. Similar ROCs are obtained for the cases
of two persons walking, and man leading an animal and are
shown in Fig. 5 and Fig. 6.
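Generating an ROC by sweeping a vector of thresholds can be sketched as follows. The statistic values below are made-up numbers used only to exercise the function, not results from the ARL data, and the function name is hypothetical.

```python
def roc_points(stats_h0, stats_h1, thresholds):
    """Return (P_F, P_D) pairs, one per threshold.

    P_F is the fraction of H0 statistics above the threshold,
    P_D is the fraction of H1 statistics above the threshold.
    """
    points = []
    for eta in thresholds:
        p_false_alarm = sum(t > eta for t in stats_h0) / len(stats_h0)
        p_detection = sum(t > eta for t in stats_h1) / len(stats_h1)
        points.append((p_false_alarm, p_detection))
    return points


# Illustrative statistic values under each hypothesis.
stats_h0 = [0.1, 0.4, 0.35, 0.8]
stats_h1 = [0.9, 0.6, 0.85, 0.5]
roc = roc_points(stats_h0, stats_h1, thresholds=[0.0, 0.45, 1.0])
```

Each threshold yields one operating point, so a denser threshold vector gives a smoother curve.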

For all three cases, we observe that our proposed
method, using the sample-wise copula selection for
non-stationary data, outperforms single copula selection and the
independence assumption. We further observe that
the two-persons and man-leading-animal cases have a higher
probability of detection ($P_D$) for a given probability of false
alarm ($P_F$), when compared to the one-person case. This
is intuitive, since for the two-persons case and
man-leading-animal case, we have a higher signal-to-noise ratio.

VI.  Conclusion 

In this paper, we have considered a detection problem
with dependent heterogeneous sensor data. We used a
copula-based approach to model the inter-sensor dependency and
applied our scheme to detect non-stationary phenomena. We
considered a specific type of non-stationarity that affects the
inter-modal dependence more severely than the individual
sensor model. A copula-based approach was used to design
detectors for dependent, non-stationary data. We have shown
that, for a simple hypothesis testing problem, a sample-wise
copula selection scheme performs better than selecting a single
copula for the entire observation window. However, a similar
conclusion cannot be formally stated for the GLRT. We note
that the copula parameters can be estimated using
distribution-free, rank-based methods such as Kendall's $\tau$ and observed that
using the sample estimate, $\hat{\tau}$, gave comparable performance
to MLE-based detection. This motivates future investigation
into whether Proposition 1 can be generalized to the case of
unknown parameters. Empirical results are encouraging, and
support this idea. We note that our method on simulated data
performs favorably even when the true copula is not a part of
the library. Similarly, results on acoustic and seismic datasets
also show that our method yields superior performance.

Fig. 4. ROCs for the ARL dataset for 1 person vs. background detection.
Fig. 5. ROCs for the ARL dataset for 2 persons vs. background detection.